-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: move bsearch function to C-code #2945
base: master
Are you sure you want to change the base?
Conversation
d975a4c
to
9e3c7bc
Compare
9e3c7bc
to
e045979
Compare
29c1f4e
to
618c10f
Compare
@emanuele6 I think this is ready |
The
|
4866769
to
949b647
Compare
@emanuele6 @itchyny I've just refactored this, and merging should be okay. |
50fc13a
to
569da27
Compare
I found a regression. On updating tests see my review comment above.
|
a77f02a
to
42135ed
Compare
I think we can reduce special cases, simplify the logic. static jv f_bsearch(jq_state *jq, jv input, jv target) {
if (jv_get_kind(input) != JV_KIND_ARRAY) {
return type_error(input, "cannot be searched from");
}
int start = 0;
int end = jv_array_length(jv_copy(input));
jv answer = jv_invalid();
while (start < end) {
int mid = (start + end) / 2;
int result = jv_cmp(jv_copy(target), jv_array_get(jv_copy(input), mid));
if (result == 0) {
answer = jv_number(mid);
break;
} else if (result < 0) {
end = mid;
} else {
start = mid + 1;
}
}
if (!jv_is_valid(answer)) {
answer = jv_number(-1 - start);
}
jv_free(input);
jv_free(target);
return answer;
} |
42135ed
to
95cbcc5
Compare
/* If the input is not sorted, bsearch will terminate but with irrelevant results. */ | ||
static jv f_bsearch(jq_state *jq, jv input, jv target) { | ||
if (jv_get_kind(input) != JV_KIND_ARRAY) { | ||
return type_error(input, "cannot be searched from"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you are leaking target; you need to free it before returning.
Otherwise looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe also add a test with a non-array input, and a target that uses allocated memory (e.g. a string), so this can be caught by the CI?
I just remembered that |
359f0e2
to
2667c55
Compare
2667c55
to
df31443
Compare
This commit fixes issue jqlang#527 and move the bsearch function to a native C-code. The performance is a bit better: Testing script: ```bash clear if [[ `uname` == Darwin ]]; then MAX_MEMORY_UNITS=KB else MAX_MEMORY_UNITS=MB fi export TIMEFMT='%J %U user %S system %P cpu %*E total'$'\n'\ 'avg shared (code): %X KB'$'\n'\ 'avg unshared (data/stack): %D KB'$'\n'\ 'total (sum): %K KB'$'\n'\ 'max memory: %M '$MAX_MEMORY_UNITS''$'\n'\ 'page faults from disk: %F'$'\n'\ 'other page faults: %R' echo "JQ code bsearch" time /usr/bin/jq -n '[range(30000000)] | bsearch(3000)' echo "C code bsearch" time ./jq -n '[range(30000000)] | bsearch(3000)' ```` Results: ``` JQ code bsearch 3000 /usr/bin/jq -n '[range(30000000)] | bsearch(3000)' 8.63s user 0.77s system 98% cpu 9.542 total avg shared (code): 0 KB avg unshared (data/stack): 0 KB total (sum): 0 KB max memory: 823 MB page faults from disk: 1 other page faults: 432828 C code bsearch 3000 ./jq -n '[range(30000000)] | bsearch(3000)' 8.44s user 0.74s system 99% cpu 9.249 total avg shared (code): 0 KB avg unshared (data/stack): 0 KB total (sum): 0 KB max memory: 824 MB page faults from disk: 0 other page faults: 432766 ``` The results may be better if we can use jvp_array_read, and there is no need to copy/free the input array in each iteration. I guess that is like that for API pourposes when the libjq is in use with multiple threads in place. Signed-off-by: Eloy Coto <[email protected]>
Co-authored-by: itchyny <[email protected]> Signed-off-by: Eloy Coto <[email protected]>
Signed-off-by: Eloy Coto <[email protected]>
df31443
to
5c0aaba
Compare
This commit fixes issue #527 and move the bsearch function to a native C-code.
The performance is a bit better:
Testing script:
Results:
The results may be better if we can use jvp_array_read, and there is no need to copy/free the input array in each iteration. I guess that is like that for API pourposes when the libjq is in use with multiple threads in place.
Closes #527